Abstract: We extend the definition of a Stochastic Hybrid Automaton (SHA) to overcome
limitations that make it difficult to use for on-line control. Since guard sets do not specify
the exact event causing a transition, we introduce a clock structure (borrowed from timed
automata), timer states, and guard functions that disambiguate how transitions occur. In the
modified SHA, we formally show that every transition is associated with an explicit element of
an underlying event set. This also makes it possible to uniformly treat all events observed on a
sample path of a stochastic hybrid system and generalize the performance sensitivity estimators
derived through Infinitesimal Perturbation Analysis (IPA). We eliminate the need for a
case-by-case treatment of different event types and provide a unified set of matrix IPA equations.
We illustrate our approach by revisiting an optimization problem for single node finite-capacity
stochastic flow systems to obtain performance sensitivity estimates in this new setting.
1. INTRODUCTION

A Stochastic Hybrid System (SHS) consists of both time- 
driven and event-driven components. Its stochastic fea- 
tures may include random transition times and external 
stochastic inputs or disturbances. The modeling and op- 
timization of these systems is quite challenging and many 
models have been proposed, some capturing randomness 
through probabilistic resets when reset functions are dis- 
tributions, through spontaneous transitions at random 
times [Bujorianu and Lygeros, 2006],[Hespanha, 2004], 
Stochastic Differential Equations (SDE) [Ghosh et al., 
1993], [Ghosh et al., 1997], or using Stochastic Flow Models 
(SFM) [Cassandras et al., 2002] with the aim of describing 
stochastic continuous dynamics. 

Optimizing the performance of SHS poses additional chal- 
lenges and most approaches rely on approximations and/or 
using computationally taxing methods. For example, [Bu- 
jorianu and Lygeros, 2004] and [Koutsoukos, 2005] resort 
to dynamic programming techniques. The inherent com- 
putational complexity of these approaches makes them 
unsuitable for on-line optimization. However, in the case 
of parametric optimization, application of Infinitesimal 
Perturbation Analysis (IPA) [Cassandras et al., 2002] to 
SHS has been very successful in on-line applications. Using 
IPA, one generates sensitivity estimates of a performance 
objective with respect to a control parameter vector based 
on readily available data observed from a single sample 
path of the SHS. Along this line, SFMs provide the most 
common framework for applying IPA in the SHS setting
and have their root in making abstractions of complex 
Discrete Event Systems (DES), where the event rates 
are treated as stochastic processes of arbitrary general- 
ity except for mild technical assumptions. A fundamental 
property of IPA is that the derivative estimates obtained 
are independent of the probability laws of the stochastic 
processes involved. Thus, they can be easily obtained and, 
unlike most other techniques, they can be implemented in 
on-line algorithms. 

In this paper, we aim to extend Stochastic Hybrid Automata
(SHA) [Cassandras and Lygeros, 2006] and create
a framework within which IPA becomes straightforward
and applicable to arbitrary SHS. A SHA specifies discrete
states (or modes) where the state x evolves according to a 
continuous vector field until an event triggers a mode tran- 
sition. The transitions are described by guards and invari- 
ants as well as clock structures [Bujorianu and Lygeros, 
2006] borrowed from Stochastic Timed Automata (STA) 
[Glynn, 1989], [Cassandras and Lafortune, 2006]. When x 
reaches a guard set, a transition becomes enabled but not 
triggered. On the other hand, if x exits the boundary 
defined by the invariant set in a mode or if a spontaneous 
event occurs, the transition must trigger. This setting has 
the following drawbacks: (i) The clock structure (normally 
part of the system input) is not incorporated in the def- 
inition of guards and invariant conditions. As a result, 
spontaneous transitions have to be treated differently. (ii)
The guard set does not specify the exact event causing a
transition and we cannot, therefore, differentiate between
an event whose occurrence time may depend on some
control parameter $\theta$ and another that does not. This is
a crucial point in IPA, as it directly affects how performance
derivative estimates evolve in time. As described in
[Cassandras et al., 2009], when applied to SFMs, IPA uses 
a classification of events into different types (exogenous, 
endogenous, and induced) to extract this information. 

Here, we seek an enriched model which explicitly specifies 
the event triggering a transition and, at the same time, 
creates a unified treatment for all events, i.e., all IPA 
equations are common regardless of event type. We achieve 
this by introducing state variables representing timers and 
treating a clock structure as an input to the system with 
the mode invariants generally dependent on both. We 
formalize the definition of an event by associating it to 
a guard function replacing the notion of the guard set. 
This removes the ambiguity caused by an enabled, but 
not triggered, event, as well as the need for treating spon- 
taneous transitions differently. A byproduct of this unified 
treatment is the development of a matrix notation for 
the IPA equations, which makes the treatment of complex 
systems with multiple states and events a straightforward 
application of these equations. We verify this process by 
applying it to a single node queueing system previously 
solved using the SFM framework [Cassandras et al., 2002].

The paper is organized as follows: In Section 2, we present 
the general SHA which includes all the features previously 
handled by SFMs. Utilizing the resulting model, in Section 
3 we develop a matrix notation for IPA which simplifies 
the derivation of sensitivity estimators in Section 4. We 
verify our results in Section 5 by applying the proposed 
technique to a single node finite capacity buffer system. 
We conclude with Section 6. 

2. GENERAL OPTIMIZATION MODEL 

Let us consider a Stochastic Hybrid Automaton (SHA) 
as defined in [Cassandras and Lafortune, 2006] with only 
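slight modifications and parameterized by the vector
$\theta = (\theta_1, \theta_2, \ldots, \theta_{N_\theta}) \in \Theta$ as

$$G = (Q, \mathcal{X}, \mathcal{U}, \Theta, \mathcal{E}, \mathbf{f}, \phi, Inv, guard, \mathbf{r}, (q_0, \mathbf{x}_0))$$

where

• $Q \subset \mathbb{Z}^+$ is the countable set of discrete states or modes;

• $\mathcal{X}(\theta) \subset \mathbb{R}^{N_x}$ is the admissible continuous state space
for any $\theta \in \Theta$;

• $\mathcal{U}(\theta) \subset \mathbb{R}^{N_u}$ is the set of inputs (possibly disturbances
or clock variables) for any $\theta \in \Theta$;

• $\Theta \subset \mathbb{R}^{N_\theta}$ is the set of admissible control parameters;

• $\mathcal{E}$ is a countable event set $\mathcal{E} = \{E_i, i = 1, 2, \ldots, N_e\}$;

• $\mathbf{f}$ is a continuous vector field, $\mathbf{f} : Q \times \mathcal{X}(\theta) \times \mathcal{U}(\theta) \times \Theta \to \mathcal{X}(\theta)$;

• $\phi$ is a discrete transition function $\phi : Q \times \mathcal{X}(\theta) \times \mathcal{U}(\theta) \times \mathcal{E} \to Q$;

• $Inv$ is a set defining an invariant condition such that
$Inv \subseteq Q \times \mathcal{X}(\theta) \times \mathcal{U}(\theta) \times \Theta$;

• $guard$ is a set defining a guard condition, $guard \subseteq Q \times Q \times \mathcal{X}(\theta) \times \mathcal{U}(\theta) \times \Theta$;

• $\mathbf{r}$ is a reset function, $\mathbf{r} : Q \times Q \times \mathcal{E} \times \mathcal{X}(\theta) \times \mathcal{U}(\theta) \times \Theta \to \mathcal{X}(\theta)$;

• $(q_0, \mathbf{x}_0)$ is the initial state.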

Note that the input $\mathbf{u} \in \mathcal{U}(\theta)$ can be a vector of random
processes which are all defined on a common probability
space $(\Omega, \mathcal{F}, P)$. Also, observe that invariants and guards
are sets which do not specify the events (hence, precise
times) causing violation or adherence to their set conditions.
Thus, we cannot differentiate between a transition
that depends on $\theta$ and one that does not. This prevents
us from properly estimating the effect of $\theta$ on the system
behavior. In particular, if $(\mathbf{x}, \mathbf{u}) \in guard(q, q')$ for some
$q, q' \in Q$, a transition to $q' \in Q$ can occur either through
a policy that uniquely specifies some $(\mathbf{x}, \mathbf{u}, \theta)$ causing the
transition or at some random time while $(\mathbf{x}, \mathbf{u}, \theta)$ remains
in the guard set. This is one of the issues we focus on in
what follows.

We allow the parameter vector $\theta$ to affect the system not
only through the vector field, reset conditions, guards, and
invariants, but also its structure through $\mathcal{X}(\theta)$ and $\mathcal{U}(\theta)$,
e.g., the parameters can appear in the state and input
constraints. We remove $\theta$ from the arguments whenever
it does not cause any confusion and simplifies notation.
Defining $\mathbf{x}(t, \theta)$ and $\mathbf{u}(t, \theta)$ as the state and input vectors,
we introduce the following assumptions:

Assumption 1. With probability 1, for any $\mathbf{x} \in \mathcal{X}$, $q \in Q$,
$\mathbf{u} \in \mathcal{U}$, and $\theta \in \Theta$, $\|\mathbf{f}(q, \mathbf{x}, \mathbf{u}, \theta)\|_\infty < \infty$ where $\|\cdot\|_\infty$ is
the $L_\infty$ norm.

Assumption 2. With probability 1, no two events can 
occur at the same time unless one causes the occurrence 
of the other. 

Assumption 1 ensures that $\mathbf{x}(t, \theta)$ remains smooth inside
a mode as is embedded in the definition of the SHA above.
Assumption 2 rules out the pathological case of having two
independent events happening at the same time.

Borrowing the concept of clock structure from Stochastic
Timed Automata as defined in [Cassandras and Lafortune,
2006], we associate an event $E_i \in \mathcal{E}$ with a sequence
$\{V_{i,1}(\theta), V_{i,2}(\theta), \ldots\}$ where $V_{i,n}(\theta)$ is the nth (generally
random) lifetime of $E_i$, i.e., if this event becomes feasible
for the nth time at some time $t$, then its next occurrence is
at $t + V_{i,n}(\theta)$. Obviously, not all events in $\mathcal{E}$ are defined to
occur in this fashion, but if they are, we define a timer
as a state variable, say $y_i$, so that it is initialized to
$y_i(t) = V_{i,n}(\theta)$ if $E_i$ becomes feasible for the nth time
at time $t$. Subsequently, the timer dynamics are given by
$\dot{y}_i = -1$ until the timer runs off, i.e., $y_i(t + V_{i,n}(\theta)) = 0$.
Figure 1 shows an example of a timer state as it evolves
according to the supplied event lifetimes. We assume that
$V_{i,n}(\theta)$ is differentiable with respect to $\theta$ for all $n$.

Fig. 1. A timer realization based on a given clock sequence $\{V_{i,n}\}$
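The timer mechanism described above can be illustrated with a minimal Python sketch. The function names `timer_event_times` and `timer_state` are hypothetical, and the sketch assumes that the event becomes feasible again immediately after each occurrence (as in the buffer example later in this section), so the timer is re-initialized with the next lifetime as soon as it runs off.

```python
def timer_event_times(lifetimes, t0=0.0):
    """Occurrence times t0 + V_1, t0 + V_1 + V_2, ... for a clock sequence."""
    times = []
    t = t0
    for v in lifetimes:
        t += v
        times.append(t)
    return times


def timer_state(t, lifetimes, t0=0.0):
    """Timer value y(t): remaining lifetime, decreasing with slope -1.

    Assumption: the timer is reloaded with the next lifetime at the instant
    it reaches zero, so y is right-continuous at event times.
    """
    start = t0
    for v in lifetimes:
        end = start + v
        if t < end:
            return end - t
        start = end
    return 0.0  # clock sequence exhausted


lifetimes = [1.0, 2.0, 0.5]            # illustrative values for V_1, V_2, V_3
print(timer_event_times(lifetimes))    # event times along the sample path
print(timer_state(0.25, lifetimes))    # remaining lifetime inside the first period
```

Since the lifetimes are treated as inputs, the timer state itself carries all the information needed to decide when the associated event occurs.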

The concept of event. In the SHA as defined above,
$\mathcal{E}$ is simply a set of labels. We provide more structure to
an event by assigning to each $E_i \in \mathcal{E}$ a guard function
$g_i : \mathbb{R}^+ \times \mathcal{X} \times \mathcal{U} \times \Theta \to \mathbb{R}$ which is not null (i.e., $g_i \not\equiv 0$)
and is assumed to be differentiable almost everywhere on
its domain. We are interested in a sample path of a SHA
$G$ on some interval $[0, T]$ where we let $\tau_k(\theta)$ be the time
when the kth transition fires and set $0 = \tau_0 < \tau_1(\theta) <
\ldots < \tau_K(\theta) < \tau_{K+1} = T$. We then define an event $E_i$ as
occurring at time $\tau_k(\theta)$ if

$$\tau_k = \inf\{t > \tau_{k-1} : g_i(t, \mathbf{x}, \mathbf{u}, \theta) = 0\}$$

that is, the event satisfies the condition $g_i(t, \mathbf{x}, \mathbf{u}, \theta) = 0$
which was not being satisfied over $(\tau_{k-1}(\theta), \tau_k(\theta))$. The
following theorem shows that using guard functions, we
can associate every transition with an event occurring at
the transition time.
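As a rough numerical illustration of this definition, the sketch below approximates $\inf\{t > \tau_{k-1} : g_i = 0\}$ for a guard function evaluated along a known trajectory. The helper `first_guard_zero` is hypothetical; it assumes the guard is supplied as a function of time only (with the state, input, and parameter already substituted) and that the guard changes sign when it crosses zero, so a scan followed by bisection is sufficient.

```python
def first_guard_zero(g, t_prev, t_end, dt=1e-3, tol=1e-10):
    """Approximate the first t in (t_prev, t_end] with g(t) = 0.

    g      : guard function along the sample path, g(t) -> float
    t_prev : previous transition time (a zero at t_prev itself is ignored)
    Returns None if no zero is detected on the scan grid.
    """
    a, ga = t_prev, g(t_prev)
    while a < t_end:
        b = min(a + dt, t_end)
        gb = g(b)
        if gb == 0.0:
            return b
        if ga != 0.0 and ga * gb < 0.0:
            # Sign change in (a, b): refine by bisection.
            lo, hi, glo = a, b, ga
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                gmid = g(mid)
                if gmid == 0.0:
                    return mid
                if glo * gmid < 0.0:
                    hi = mid
                else:
                    lo, glo = mid, gmid
            return 0.5 * (lo + hi)
        a, ga = b, gb
    return None


# Illustrative timer guard g(t) = y(t) with y(0) = 1 and slope -0.5.
print(first_guard_zero(lambda t: 1.0 - 0.5 * t, 0.0, 5.0))
```

A tangential contact with zero (no sign change) would be missed by this scan; that case is excluded later in the text by the requirement that the guard's time derivative be nonzero at the event time.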

Theorem 1. For every STA $G$ with $\mathcal{E}$, $\mathcal{X}$ and $Q$, there
exists another STA $\tilde{G}$ with event set $\tilde{\mathcal{E}}$, continuous state
space $\tilde{\mathcal{X}}$ and discrete state space $\tilde{Q}$ such that $\tilde{Q} = Q$ and
every transition $(q, q')$ in $\tilde{G}$ is associated with an event
$e \in \tilde{\mathcal{E}}$ at the transition time.

Proof: A transition $(q, q')$ is dictated by the transition
function $\phi(q, \mathbf{x}, \mathbf{u}, e)$ such that $\phi(q, \mathbf{x}, \mathbf{u}, e) = q'$ for some
$\mathbf{x} \in \mathcal{X}(\theta)$, $\mathbf{u} \in \mathcal{U}(\theta)$, $e \in \mathcal{E}$. If $\phi(q, \mathbf{x}, \mathbf{u}, E_i) = \phi(q, E_i) = q'$
for some $E_i \in \mathcal{E}$, i.e., the transition $(q, q')$ is independent
of $\mathbf{x} \in \mathcal{X}(\theta)$, $\mathbf{u} \in \mathcal{U}(\theta)$, the proof is complete. In this
case, we can always augment $\mathbf{x} \in \mathcal{X}$ to $\tilde{\mathbf{x}} = (\mathbf{x}, y_i) \in \tilde{\mathcal{X}}$
where $y_i$ is a timer state variable capturing lifetimes
of event $E_i \in \mathcal{E}$ and associate $E_i$ with guard function
$g_i(t, \tilde{\mathbf{x}}, \mathbf{u}, \theta) = y_i$. If $\phi(q, \mathbf{x}, \mathbf{u}, e) = \phi(q, \mathbf{x}, \mathbf{u}) = q'$,
i.e., the transition $(q, q')$ depends on $(\mathbf{x}, \mathbf{u}, \theta)$, then it
is either a result of violating $Inv(q)$ or it occurs while
$(\mathbf{x}, \mathbf{u}, \theta) \in guard(q, q')$. In the former case, we can define
some $E_i$ such that $g_i(t, \mathbf{x}, \mathbf{u}, \theta) = 0$ is the condition that
determines the occurrence time of $E_i$. This is because
$Inv(q)$ can be violated in two ways: (a) directly, due to
an occurrence of $E_i$ meaning $(\mathbf{x}, \mathbf{u})$ is on the boundary
of $Inv(q)$ at the transition time; (b) indirectly, due to a
reset condition which is the result of a previous transition
$\phi(q', \mathbf{x}, \mathbf{u}, e) = q$, where it is possible that $q' = q$ (a
self-loop transition). That is, the reset condition is such
that $\mathbf{r}(q', q, e, \mathbf{x}, \mathbf{u}, \theta) \notin Inv(q)$. In case (a), $g_i(\cdot)$ is such
that $g_i(t, \mathbf{x}, \mathbf{u}, \theta) = 0$ is part of the boundary of $Inv(q)$
including $(\mathbf{x}, \mathbf{u})$. In case (b), the transition can only occur
as (i) a result of some $e \in \mathcal{E}$ (completing the proof); (ii)
the violation of $Inv(q')$; or (iii) while $(\mathbf{x}, \mathbf{u}) \in guard(q', q)$.
Since we have already considered the first two cases, we
only need to check case (iii), including $q' = q$. This
case can occur either through (A) a policy equivalent to
a condition $g(t, \mathbf{x}, \mathbf{u}, \theta) = 0$; or (B) after some random
time. In the former case, we can define some $E_i$ such that
$g_i(t, \mathbf{x}, \mathbf{u}, \theta) = g(t, \mathbf{x}, \mathbf{u}, \theta) = 0$ and include $E_i \in \mathcal{E}$. In case
(B), let $\tau = \inf\{t > \tau_{k-1} : (\mathbf{x}(t), \mathbf{u}(t)) \in guard(q', q)\}$ be
the time that $(\mathbf{x}, \mathbf{u})$ enters $guard(q', q)$. We can associate
a self-loop transition $(q', q')$ at $\tau$ which is caused by some
event $E_i$ with guard function $g_i(\cdot)$ such that $g_i(t, \mathbf{x}, \mathbf{u}, \theta) = 0$.
Note that $(\mathbf{x}, \mathbf{u})$ satisfying this condition forms part of
the boundary of $guard(q, q')$. We define a reset condition
for this transition such that a timer with state $y_j$ gets
initialized with a random value $V_j(\theta)$ such that $\tau + V_j(\theta)$
is the time of transition $(q', q)$. We can then include $y_j$ in $\tilde{\mathbf{x}}$
and define the event at $\tau + V_j(\theta)$ as $E_j \in \mathcal{E}$ and associate
with it a guard function $g_j(t, \tilde{\mathbf{x}}, \mathbf{u}, \theta) = y_j$. Since such
events $E_i$, $E_j$ can always be defined, the proof is complete.
■

In the above proof, we turn the reader's attention to how 
timers and guard functions remove the need for guard sets. 



Also, in the case of a chain of simultaneous transitions, 
they identify an event whose guard function determines 
when the transitions occur. 

Example. To illustrate this framework based on events
associated with guard functions, consider the example of
a SHA as shown in Fig. 2(a). This models a simple flow
system with a buffer whose content is $x(t, \theta) \geq 0$. The
dynamics of the content are $\dot{x}(t, \theta) = 0$ when $q = 0$
(empty buffer) and $\dot{x}(t, \theta) = \alpha(t, \theta) - \beta$, otherwise. Here,
$\beta > 0$ is a fixed outflow rate and $\{\alpha(t, \theta)\}$ is a piecewise
differentiable random process whose behavior depends on
$\theta$ via a continuous vector field $f_\alpha(t, \theta)$. We allow for
discontinuous jumps in the value of $\alpha(t, \theta)$ at random
points in time modeled through events that occur at time
instants $V_1, V_1 + V_2, \ldots$ using a timer state variable $y(t)$
re-initialized to $V_{n+1}$ after the nth time that $y(t) = 0$.
The result of such an event at time $t$ is a new value
$\alpha(t^+) = A_{n+1}$ where this jump process is independent
of $\theta$. In mode $q = 0$, the invariant condition $\alpha(t, \theta) \leq \beta$ is
required to ensure that the buffer remains empty.

The state of the new SHA, shown in Fig. 2(b), is denoted
by $\mathbf{x} = (\alpha, x, y)$ where note that $\dot{y}(t) = -1$. The event
set is $\mathcal{E} = \{E_1, E_2, E_3\}$ with $g_1(t, \mathbf{x}, \mathbf{u}, \theta) = \alpha(t, \theta) - \beta$,
$g_2(t, \mathbf{x}, \mathbf{u}, \theta) = x(t, \theta)$, and $g_3(t, \mathbf{x}, \mathbf{u}, \theta) = y(t, \theta)$.
In addition, we define the reset condition $\mathbf{r}(0, 0, E_3) =
\mathbf{r}(1, 1, E_3) = (A_{n+1}, x, V_{n+1})$ whenever $E_3$ occurs for the
nth time, treating $\{A_n\}$ and $\{V_n\}$, $n = 1, 2, \ldots$, as input
processes.

When $q = 0$, two events are possible: (i) If $E_1$ occurs at
$t = \tau_k$ we have $\alpha(\tau_k, \theta) - \beta = 0$ and a transition to $q = 1$
occurs since the condition $\alpha(t, \theta) - \beta < 0$ must have held
at $\tau_k^-$. (ii) If $E_3$ occurs at $t = \tau_k$ we have $y(\tau_k) = 0$
and a self-loop transition results. By the reset condition,
$\alpha(\tau_k^+) = A_{n+1}$ for some $A_{n+1}$, assuming this was the nth
occurrence of this event. If $A_{n+1} > \beta$, then immediately a
transition to $q = 1$ occurs. Observe that even though the
condition of this transition is $\alpha(\tau_k^+) > \beta$, the transition is
still due to event $E_3$.
Fig. 2. A simple fluid buffer SHA: Contrasting two approaches. Panel (b): Proposed SHA.

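When $q = 1$, all three events are possible, but $E_1$, $E_3$ cause
self loops. If $E_2$ occurs at some $t = \tau_k$, then $x(\tau_k) = 0$.
On the other hand, suppose that a transition to $q = 1$
occurred because of $E_1$ at $t = \tau_k$, i.e., $\alpha(\tau_k, \theta) - \beta = 0$. It
is possible that $\alpha(t, \theta) - \beta = 0$ for $t \in [\tau_k, \tau_k + \epsilon]$, $\epsilon > 0$.
In this case, $\dot{x} = \alpha(t, \theta) - \beta = 0$ and $x(\tau_k^+) = 0$. This
violates the invariant condition $[x > 0]$ at $q = 1$, causing
an immediate return to $q = 0$. Similarly, if $\alpha(\tau_k, \theta) - \beta = 0$
but $\alpha(\tau_k^+, \theta) - \beta < 0$, the invariant condition at $q = 1$ is
violated and there is an immediate return to $q = 0$. All
these can be summarized in the transition functions below:

$$\phi(0, \mathbf{x}, \mathbf{u}, e) = \begin{cases} 1 & \text{if } e = E_1 \text{ or } \alpha(t, \theta) > \beta \\ 0 & \text{otherwise} \end{cases}$$

$$\phi(1, \mathbf{x}, \mathbf{u}, e) = \begin{cases} 0 & \text{if } e = E_2 \text{ or } x(t, \theta) = 0 \\ 1 & \text{otherwise} \end{cases}$$

where $\mathbf{u} = \alpha$.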
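The two transition functions and the reset associated with $E_3$ can be written down directly. The following is a minimal sketch, assuming events are represented by the hypothetical string labels `"E1"`, `"E2"`, `"E3"` and that the relevant state components are passed in as plain numbers; it is meant only to make the case analysis above concrete.

```python
E1, E2, E3 = "E1", "E2", "E3"  # hypothetical labels for the three events


def phi(q, alpha, x, beta, event=None):
    """Discrete transition function of the buffer example.

    q = 0: empty buffer, q = 1: non-empty buffer.
    `event` is the event that just occurred (None if no event occurred).
    """
    if q == 0:
        return 1 if (event == E1 or alpha > beta) else 0
    if q == 1:
        return 0 if (event == E2 or x == 0) else 1
    raise ValueError("unknown mode")


def reset_E3(x, a_next, v_next):
    """Reset on E3: new inflow value, unchanged content, reloaded timer."""
    return (a_next, x, v_next)


beta = 1.0
# E3 fires in mode 0 and the jump lands above beta: the reset is applied first,
# then the condition alpha > beta moves the system to mode 1.
alpha, x, y = reset_E3(x=0.0, a_next=2.0, v_next=0.7)
print(phi(0, alpha, x, beta, event=E3))
```

Note that in the last call the mode change is attributed to $E_3$, not to $E_1$, which mirrors the observation made in the text about transitions that follow a jump in $\alpha$.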



It is important to note that the conditions $[\alpha > \beta]$, $[x = 0]$
have different meanings in Figs. 2(a), (b). In the former,
they define the guard set conditions. As confirmed by
Fig. 2(a), the guard set conditions (e.g., $[\alpha > \beta]$ at $q = 0$)
cannot differentiate between (i) a smooth transition from
the invariant $[\alpha \leq \beta]$ to $[\alpha > \beta]$, hence, a transition to $q = 1$
and (ii) a jump in $\alpha(t)$ that causes $[\alpha(t^+) > \beta]$ to become
true without satisfying $\alpha(t) = \beta$. Recall that the former
depends on $\theta$, whereas the latter does not. On the other
hand, the set conditions on the transitions in Fig. 2(b) have
a different meaning: they identify a condition caused by
an event occurring at the same time but in a previous
transition. Thus, $[\alpha > \beta]$ is clearly associated with a jump
in $\alpha$ in the previous transition and is independent of $\theta$. If
we are to control $\theta$ to affect the system's performance, it
is obviously crucial to identify transitions that depend on
it as opposed to ones that do not.

Another byproduct of using guard functions is that they enable
us to treat all events uniformly, unlike the conventional IPA
approach, which normally categorizes events into exogenous,
endogenous, and induced classes and derives different equations
to capture their dependence on the parameter vector $\theta$.
Moreover, inclusion of
timers in the states eliminates the need for a spontaneous
event and the ambiguous notions of "enabled" events and
"waiting" in a guard set. To show that the framework
above encompasses the event classification in [Cassandras
et al., 2009] for a SFM, we give a simple definition of each
event class in terms of guard functions:

• Exogenous Events: In [Cassandras et al., 2009], an event
$E_i \in \mathcal{E}$ is defined as exogenous if it causes a transition
at time $\tau_k$ independent of $\theta$ and satisfies the gradient
condition $\frac{d\tau_k}{d\theta} = 0$. In our case, we define $E_i$ occurring
at $t = \tau_k$ as exogenous if the associated guard function
$g_i(\tau_k, \mathbf{x}, \mathbf{u}, \theta) = g_i(\tau_k, \mathbf{x}, \mathbf{u})$ is independent of $\theta$ so that
$\frac{dg_i(\tau_k, \mathbf{x}, \mathbf{u})}{d\theta} = 0$. In our framework, an exogenous event
is associated with a timer state variable whose guard
function is independent of $\theta$.

• Endogenous Events: In contrast to an exogenous event,
an endogenous one is such that $\frac{dg_i(\tau_k, \mathbf{x}, \mathbf{u}, \theta)}{d\theta} \neq 0$.
This includes cases where $g_i(\tau_k, \mathbf{x}(\tau_k, \theta), \mathbf{u}(\tau_k, \theta), \theta) =
y_i(\tau_k, \theta) = 0$, for some timer state $y_i$.

• Induced Events: In [Cassandras et al., 2009], an event
$E_j \in \mathcal{E}$ occurring at $t = \tau_k$ is called induced if it is caused
by the occurrence of another event (the triggering event)
at $\tau_m < \tau_k$ ($m < k$). In our case, an event is induced if
there exists a state variable $x_j$ associated with $E_j \in \mathcal{E}$ for
which the following conditions are met:

$$\tau_m = \max\{t < \tau_k : x_j(t, \theta) \neq 0,\ \forall t \in (\tau_m, \tau_k)\}. \tag{1a}$$

$$\tau_k = \min\{t > \tau_m : x_j(t, \theta) = 0\}. \tag{1b}$$

In this case, the active guard function at $\tau_k$ is

$$g_j(\tau_k, \mathbf{x}(\tau_k, \theta), \mathbf{u}(\tau_k, \theta), \theta) = x_j(\tau_k, \theta) = 0. \tag{2}$$

Generally, the initial value of state $x_j(\tau_m, \theta)$ is determined
by a reset function associated with transition $m$. After
the reset, the dynamics $\dot{x}_j = f_j(t, \theta)$ can be arbitrary until
$x_j(t, \theta) = 0$ is satisfied. In this sense, the timer events are
simple cases of induced events with trivial dynamics.

As already mentioned, it is possible to have simultaneous 
transitions. This is a necessary condition to have chatter- 
ing in the SHA sample path which is mostly undesirable. 
To ensure a bounded number of transitions in the interval 
[0,T], let us introduce the following assumption: 

Assumption 3. With probability 1, the number of 
simultaneous transitions is finite. 

Since the number of transitions occurring at different 
times is finite over a finite interval [0,T], Assumption 3 
ensures that the total number of events observed on the 
sample path is finite. Depending on the system in question,
Assumption 3 translates into conditions on the states,
parameters, and inputs (see Assumption 5 for the
example in Section 5). We do not give the conditions under
which Assumption 3 is valid in a general SHS setting. More 
on this can be found in [Simic et al., 2000] and standard 
references on control of hybrid systems. 

2.1 The optimization problem 

Observing the SHS just described over the interval $[0, T]$,
we seek to solve the following optimization problem:

$$(P) \quad \theta^* = \arg\min_{\theta \in \Theta} J(T, \theta) = E_\omega[L(T, \theta, \omega)],$$

subject to

$$\begin{aligned}
&\dot{\mathbf{x}}(t, \theta, \omega) = \mathbf{f}(q(t), t, \mathbf{x}(t, \theta, \omega), \mathbf{u}(t, \theta, \omega), \theta), \\
&\mathbf{x} \in Inv(q), \quad \mathbf{u} \in \mathcal{U}(\theta) \\
&\mathbf{x}(\tau_k^+, \theta, \omega) = \mathbf{r}(q_{k-1}, q_k, \mathbf{x}(\tau_k^-, \theta, \omega), \mathbf{u}(\tau_k^-, \theta, \omega), \theta) \\
&q(t) \in \{1, \ldots, N_q\}, \quad k = 1, \ldots, K(\omega) \\
&\mathbf{x}(0, \theta, \omega) = \mathbf{x}_0,
\end{aligned}$$

where $q(t) = q_k$ when $t \in [\tau_k, \tau_{k+1})$ and $L(T, \theta, \omega)$ is a
sample function generally defined as

$$L(T, \theta, \omega) = \int_0^T \ell(q(t), t, \mathbf{x}(t, \theta, \omega), \mathbf{u}(t, \theta, \omega), \theta)\,dt \tag{3}$$

for some given function $\ell(\cdot)$. Notice that although it is
possible to treat time $t$ as a continuous state variable, we
make the dependence of various functions on $t$ explicit and
do not include it in $\mathbf{x}$ as it makes our analysis easier to
follow.

We solve problem (P) using IPA. The objective of IPA is
to specify how the changes in $\theta$ affect $\mathbf{x}(t, \theta, \omega)$ and ultimately,
to calculate $\frac{dL(T, \theta, \omega)}{d\theta}$. This is done by finding the
gradients of state $\mathbf{x}$ and event times $\tau_k(\theta)$, $k = 1, \ldots, K$,
with respect to $\theta$. It has been shown that under mild
smoothness conditions the result is an unbiased estimate of
the objective function gradient $\frac{dJ(T, \theta)}{d\theta}$ [Cassandras et al.,
2009]. Thus, coupling the sensitivity estimates with a
gradient-based optimization algorithm can optimize the
system performance.
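To make the last remark concrete, the sketch below shows one common way of coupling a gradient estimate with a stochastic-approximation iteration. It is an illustration under stated assumptions, not the algorithm used in this paper: `estimate_gradient` is a hypothetical callable standing in for an IPA estimator evaluated on one observed sample path, the step size schedule `step0 / (n + 1)` is an arbitrary choice, and the optional projection simply clips $\theta$ to a box.

```python
import numpy as np


def ipa_gradient_descent(estimate_gradient, theta0, n_iter=100, step0=0.5,
                         lower=None, upper=None):
    """Iterate theta <- theta - eta_n * H_n(theta), with eta_n = step0 / (n + 1).

    estimate_gradient : callable returning a gradient estimate of the sample
                        cost at theta (hypothetical stand-in for IPA).
    lower, upper      : optional box bounds for the admissible parameter set.
    """
    theta = np.atleast_1d(np.asarray(theta0, dtype=float)).copy()
    for n in range(n_iter):
        grad = np.atleast_1d(np.asarray(estimate_gradient(theta), dtype=float))
        theta = theta - (step0 / (n + 1)) * grad
        if lower is not None or upper is not None:
            theta = np.clip(theta, lower, upper)  # projection onto the box
    return theta


# Toy check with an exact gradient of J(theta) = (theta - 3)^2.
print(ipa_gradient_descent(lambda th: 2.0 * (th - 3.0), theta0=0.0))
```

In an on-line setting each call to `estimate_gradient` would correspond to observing the system over a window $[0, T]$ with the current parameter value.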



3. UNIFIED IPA APPROACH 



3.1 Matrix Notation 



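As mentioned before, in order to determine the
sample cost gradient with respect to $\theta$, one needs to find
the event time and state derivatives with respect to it.

Let $v(t, \theta)$ be a scalar function which is differentiable
with respect to $\theta$. We define the gradient vector with
respect to $\theta$ as $v'(t, \theta) = \frac{\partial v}{\partial \theta} = \left(\frac{\partial v}{\partial \theta_1}, \ldots, \frac{\partial v}{\partial \theta_{N_\theta}}\right)$.
Moreover, we denote the full and partial Jacobian of a
vector $\mathbf{v}(t, \theta) \in \mathbb{R}^M$ with respect to $\theta$ by

$$\frac{d\mathbf{v}(t, \theta)}{d\theta} = \left[\frac{dv_i(t, \theta)}{d\theta_j}\right] \in \mathbb{R}^{M \times N_\theta} \tag{4}$$

$$\mathbf{v}'(t, \theta) = \left[\frac{\partial v_i(t, \theta)}{\partial \theta_j}\right] \in \mathbb{R}^{M \times N_\theta} \tag{5}$$

where $v_i(t, \theta)$ ($i \leq M$) is the i-th entry of $\mathbf{v}(t, \theta)$. With
a slight abuse of notation we use

$$\frac{d\mathbf{v}(t, \theta)}{d\mathbf{x}} = \left[\frac{dv_i(t, \theta)}{dx_j}\right], \quad \frac{\partial \mathbf{v}(t, \theta)}{\partial \mathbf{x}} = \left[\frac{\partial v_i(t, \theta)}{\partial x_j}\right] \in \mathbb{R}^{M \times N_x}$$

and

$$\frac{d\mathbf{v}(t, \theta)}{d\mathbf{u}} = \left[\frac{dv_i(t, \theta)}{du_k}\right], \quad \frac{\partial \mathbf{v}(t, \theta)}{\partial \mathbf{u}} = \left[\frac{\partial v_i(t, \theta)}{\partial u_k}\right] \in \mathbb{R}^{M \times N_u}$$

as the full and partial
Jacobians of $\mathbf{v}(t, \theta)$ with respect to $\mathbf{x}$ and $\mathbf{u}$.

For the event times $\tau_k(\theta)$, $k = 1, \ldots, K$, the gradient with
respect to $\theta$ is defined as

$$\tau'_k = (\tau'_{k,1}, \ldots, \tau'_{k,N_\theta})$$

where $\tau'_{k,j} = \frac{\partial \tau_k}{\partial \theta_j}$. We let $\tau'_{0,j} = \tau'_{K+1,j} = 0$ for all $j$ since
the start and end of the sample path are fixed values.
Finally, we define $\tau'$ as a $N_e \times N_\theta$ matrix such that its
ith row is associated with event $E_i$ and its jth column
is associated with the variable with respect to which the
differentiation is done.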

In what follows, we derive a unified set of equations which
give the event-time and state derivatives with respect to
$\theta$ and are in concord with the results of IPA presented in
[Cassandras et al., 2009]. All calculations can be done in
two generic steps, regardless of the type of event observed,
i.e., we do not need to differentiate between exogenous,
endogenous, and induced events.

4. INFINITESIMAL PERTURBATION ANALYSIS 



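4.1 Event-time Derivatives

By Theorem 1, for any $k = 1, \ldots, K$, transition $k$ is
directly or indirectly associated with an event $e \in \mathcal{E}$ with
a guard function $g(\cdot)$ such that $g(\tau_k, \mathbf{x}, \mathbf{u}, \theta) = 0$. Let us
define the guard vector

$$\mathbf{g}(t, \mathbf{x}, \mathbf{u}, \theta) = (g_1(t, \mathbf{x}, \mathbf{u}, \theta), \ldots, g_{N_e}(t, \mathbf{x}, \mathbf{u}, \theta)). \tag{7}$$

and a unit firing vector $\mathbf{e}_i$, $i = 1, \ldots, N_e$ as

$$\mathbf{e}_i = (0, \ldots, 1, \ldots, 0) \in \mathbb{R}^{N_e}$$

where only the i-th element is 1 and the rest are 0.

Let $G(t, \mathbf{x}, \mathbf{u}, \theta)$ be a diagonal matrix function where
$G_{i,i} = g_i$, $i = 1, \ldots, N_e$ and denote its time derivative at
$t = \tau$ by $\dot{G}(\tau, \mathbf{x}, \mathbf{u}, \theta)$. We can obtain $\tau'$ by differentiating
$\mathbf{g}(\tau, \mathbf{x}, \mathbf{u}, \theta) = 0$ with respect to $\theta$. This gives

$$\tau' = -\dot{G}(\tau^-, \mathbf{x}, \mathbf{u}, \theta)^{-1} \frac{d\mathbf{g}(\tau^-, \mathbf{x}, \mathbf{u}, \theta)}{d\theta} \tag{8}$$

where

$$\frac{d\mathbf{g}(\tau^-, \mathbf{x}, \mathbf{u}, \theta)}{d\theta} = \frac{\partial \mathbf{g}}{\partial \mathbf{x}} \frac{d\mathbf{x}(\tau^-, \theta)}{d\theta} + \frac{\partial \mathbf{g}}{\partial \mathbf{u}} \mathbf{u}'(\tau^-, \theta) + \mathbf{g}'(\tau^-, \mathbf{x}, \mathbf{u}, \theta). \tag{9}$$

It is easily verified that the simple equation (8) is in line
with what has been reported in prior work on IPA for SHS,
e.g., in [Cassandras et al., 2009]. The following assumption
is introduced so that $\tau'_k$ exists:

Assumption 4. With probability 1, if $\tau_k$ is the occurrence
time of $E_i$, we have $\dot{g}_i(\tau_k^-, \mathbf{x}, \mathbf{u}, \theta) \neq 0$.

Note that the case of a contact point where $\dot{g}_i(\tau_k, \theta)$ does
not exist has already been excluded by Assumption 2,
hence, $\dot{g}_i(\tau_k^-, \mathbf{x}, \mathbf{u}, \theta)$ is always well-defined. Also, observe
that only one row in (8) is evaluated at each transition. In
fact, $\tau'$ is a generic matrix function such that if transition
$k$ is associated with event $E_i$, only its ith row is evaluated.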


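Below, we drop $T$ from $L(T, \theta, \omega)$ and $\omega$ from the arguments
of other functions to simplify the notation. However,
we still write $L(\theta, \omega)$ to stress that we carry out the
analysis on the sample path of system $G$ denoted by $\omega$.
We write (3) as

$$L(\theta, \omega) = \sum_{k=0}^{K} \int_{\tau_k}^{\tau_{k+1}} \ell(q_k, t, \mathbf{x}, \mathbf{u}, \theta)\,dt,$$

Recalling $\tau'_0 = \tau'_{K+1} = 0$, we calculate the gradient of the
sample cost with respect to $\theta$ as

$$\frac{dL(\theta, \omega)}{d\theta} = \sum_{k=1}^{K} \left[\ell(q_{k-1}, \tau_k^-, \mathbf{x}, \mathbf{u}, \theta) - \ell(q_k, \tau_k^+, \mathbf{x}, \mathbf{u}, \theta)\right] \tau'_k + \sum_{k=0}^{K} \int_{\tau_k}^{\tau_{k+1}} \frac{d\ell(q_k, t, \mathbf{x}, \mathbf{u}, \theta)}{d\theta}\,dt \tag{6}$$

where

$$\frac{d\ell(q_k, t, \mathbf{x}, \mathbf{u}, \theta)}{d\theta} = \frac{\partial \ell}{\partial \mathbf{x}} \mathbf{x}'(t, \theta) + \frac{\partial \ell}{\partial \mathbf{u}} \mathbf{u}'(t, \theta) + \ell'(q_k, t, \mathbf{x}, \mathbf{u}, \theta)$$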

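4.2 State Derivatives

Here, we determine how the state derivatives evolve both
at transition times and in between them, i.e., within a
mode $q \in Q$.

Derivative update at transition times: At each transition
we consider two cases:

(a) No Reset: In this case, we have $x_j(\tau_k^-, \theta) = x_j(\tau_k^+, \theta)$
for all $j \in \{1, \ldots, N_x\}$. Assume that for every $q \in Q$,
$\dot{x}_j(t, \theta) = f_j(q, t, \mathbf{x}, \mathbf{u}, \theta)$. Then, we use the following
equation, derived in [Cassandras et al., 2009], to update
the state derivatives:

$$x'_j(\tau_k^+, \theta) = x'_j(\tau_k^-, \theta) + \left[f_j(q_{k-1}, \tau_k^-, \mathbf{x}, \mathbf{u}, \theta) - f_j(q_k, \tau_k^+, \mathbf{x}, \mathbf{u}, \theta)\right] \tau'_k. \tag{10}$$

(b) Reset: In this case, there exists $j \in \{1, \ldots, N_x\}$ such
that $x_j(\tau_k^-, \theta) \neq x_j(\tau_k^+, \theta)$. Let $e \in \mathcal{E}$ be the direct cause
of transition $(q_{k-1}, q_k)$ at $\tau_k$ (i.e., $e$ appears on the arc
connecting $q_{k-1}$ and $q_k$ in the automaton). We then define
a reset condition $x_j(\tau_k^+, \theta) = r_j(q_{k-1}, q_k, \mathbf{x}, \mathbf{u}, \theta, e)$ where
$r_j(\cdot)$ is the reset function of $x_j$. Thus, we get

$$x'_j(\tau_k^+, \theta) = \frac{dr_j(q_{k-1}, q_k, \mathbf{x}, \mathbf{u}, \theta, e)}{d\theta} \tag{11}$$

where

$$\frac{dr_j(q_{k-1}, q_k, \mathbf{x}, \mathbf{u}, \theta, e)}{d\theta} = \frac{\partial r_j}{\partial \mathbf{x}} \frac{d\mathbf{x}}{d\theta} + \frac{\partial r_j}{\partial \mathbf{u}} \mathbf{u}' + r'_j(q_{k-1}, q_k, \mathbf{x}, \mathbf{u}, \theta, e). \tag{12}$$

For other transitions which are indirectly caused by an
event, we simply define $\mathbf{r}(q_{k-1}, q_k, \mathbf{x}, \mathbf{u}, \theta) = \mathbf{x}(\tau_k^-, \theta)$ as
no reset is possible on them.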

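To put everything in matrix form, let us first define, for
every $i = 1, \ldots, N_e$, the index set

$$\Phi_i = \{(m, n) \in Q \times Q : \exists\, \mathbf{x}, \mathbf{u} \in \mathcal{X}, \mathcal{U} \text{ s.t. } \phi(m, \mathbf{x}, \mathbf{u}, E_i) = n\}$$

containing all transitions directly associated with $E_i$. Also,
for each $i$, let us define the reset mapping

$$\mathbf{r}_i(m, n) = \begin{cases} \mathbf{r}(m, n, \mathbf{x}, \mathbf{u}, \theta, E_i) & \text{if } (m, n) \in \Phi_i \\ \mathbf{x} & \text{otherwise} \end{cases}$$

and the diagonal matrix $C(m, n) \in \mathbb{R}^{N_x \times N_x}$ with its jth
diagonal entry $C_{jj}(m, n) = 1$ if $x_j$ is not reset by transition
$(m, n)$ and 0, otherwise. We also define $\bar{C}(m, n) =
I_{N_x \times N_x} - C(m, n)$ where $I_{N_x \times N_x}$ is the identity matrix
with the specified dimensions. Moreover, let us define the
reset map matrix as

$$R(\tau_k) = \begin{bmatrix} \mathbf{r}_1(q_{k-1}, q_k) \\ \vdots \\ \mathbf{r}_{N_e}(q_{k-1}, q_k) \end{bmatrix} \in \mathbb{R}^{N_e \times N_x} \tag{13}$$

where $x_j(\tau_k^+) = r_{i,j}(q_{k-1}, q_k)$ when the kth transition
is due to $E_i$ and $x_j(\tau_k^+) = x_j(\tau_k^-)$, otherwise. Thus,
if $\mathbf{x}$ remains continuous at its occurrence time we get
$\mathbf{r}_i(q_{k-1}, q_k) = \mathbf{x}(\tau_k, \theta)$. Finally, we define the shorthand
notation $\Delta\mathbf{f}(q_{k-1}, q_k, \theta) = \mathbf{f}(q_{k-1}, \tau_k^-, \mathbf{x}, \mathbf{u}, \theta) -
\mathbf{f}(q_k, \tau_k^+, \mathbf{x}, \mathbf{u}, \theta)$ for the jump in the dynamics at the kth
transition. Using these definitions we can combine part (a)
and (b) above and write

$$\mathbf{x}'(\tau_k^+) = C(q_{k-1}, q_k)\left[\mathbf{x}'(\tau_k^-) + \Delta\mathbf{f}(q_{k-1}, q_k, \theta)^T \tau'_k\right] + \left(\mathbf{e}_i R(\tau_k) \bar{C}(q_{k-1}, q_k)\right)'. \tag{14}$$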
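A small NumPy sketch may help in reading the dimensions in (14). It is one possible reading of the equation, under the assumptions that $\mathbf{x}'$ is stored as an $N_x \times N_\theta$ matrix, that the jump term is the outer product of the $N_x$-vector $\Delta\mathbf{f}$ with the $N_\theta$-vector $\tau'_k$, and that the last term is supplied as the Jacobian of the active reset map with respect to $\theta$ (masked by $\bar{C}$ so that only reset states are affected). All names are hypothetical.

```python
import numpy as np


def state_derivative_update(C, x_prime_minus, delta_f, tau_prime_k, reset_jacobian):
    """One reading of (14): derivative update across the k-th transition.

    C              : (N_x, N_x) diagonal 0/1 matrix, 1 where x_j is NOT reset
    x_prime_minus  : (N_x, N_theta) state derivatives just before the event
    delta_f        : (N_x,) jump in the dynamics, f(q_{k-1}) - f(q_k)
    tau_prime_k    : (N_theta,) event-time gradient
    reset_jacobian : (N_x, N_theta) derivative of the active reset map
    """
    C = np.asarray(C, dtype=float)
    C_bar = np.eye(C.shape[0]) - C
    continuous_part = C @ (np.asarray(x_prime_minus, dtype=float)
                           + np.outer(delta_f, tau_prime_k))
    reset_part = C_bar @ np.asarray(reset_jacobian, dtype=float)
    return continuous_part + reset_part


# Toy case: x grows at rate a = 2 until it reaches theta and then stays there.
# Before the event x' = 0 and tau' = 1 / a; afterwards x = theta, so x' = 1.
print(state_derivative_update(C=[[1.0]], x_prime_minus=[[0.0]],
                              delta_f=[2.0], tau_prime_k=[0.5],
                              reset_jacobian=[[0.0]]))
```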

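Derivative update between transitions: Assuming the
mode is $q_k$, we only need to perform the following operation
on interval $[\tau_k, \tau_{k+1})$:

$$\mathbf{x}'(t, \theta) = \mathbf{x}'(\tau_k^+, \theta) + \int_{\tau_k}^{t} \frac{d\mathbf{f}(q_k, \tau, \mathbf{x}, \mathbf{u}, \theta)}{d\theta}\,d\tau \tag{15}$$

where $\frac{d\mathbf{f}(q_k, \tau, \mathbf{x}, \mathbf{u}, \theta)}{d\theta}$ is a $N_x \times N_\theta$ Jacobian matrix of the
state dynamics defined on $[\tau_k, \tau_{k+1})$ as

$$\frac{d\mathbf{f}(q_k, \tau, \mathbf{x}, \mathbf{u}, \theta)}{d\theta} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \frac{d\mathbf{x}(\tau, \theta)}{d\theta} + \frac{\partial \mathbf{f}}{\partial \mathbf{u}} \mathbf{u}'(\tau, \theta) + \mathbf{f}'(q_k, \tau, \mathbf{x}, \mathbf{u}, \theta) \tag{16}$$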
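Equations (15) and (16) amount to integrating a linear differential equation for $\mathbf{x}'$ alongside the state itself. The sketch below does this with a forward Euler scheme for a scalar state and a scalar parameter, assuming no input dependence (so the $\mathbf{u}'$ term is dropped); the function name, the Euler discretization, and the test dynamics $\dot{x} = -\theta x$ are all choices made for illustration.

```python
import math


def integrate_state_derivative(f, df_dx, df_dtheta, x0, xp0, theta, t0, t1, dt=1e-4):
    """Euler integration of x and x' = dx/dtheta within one mode.

    Uses d/dt x' = (df/dx) * x' + (df/dtheta), i.e. (16) without an input term.
    Returns (x(t1), x'(t1)).
    """
    n_steps = int(round((t1 - t0) / dt))
    x, xp, t = float(x0), float(xp0), float(t0)
    for _ in range(n_steps):
        dx = f(t, x, theta)
        dxp = df_dx(t, x, theta) * xp + df_dtheta(t, x, theta)
        x += dx * dt
        xp += dxp * dt
        t += dt
    return x, xp


# Test dynamics x_dot = -theta * x with x(0) = 1, so x = exp(-theta * t)
# and the exact sensitivity is x' = -t * exp(-theta * t).
theta = 0.5
x_end, xp_end = integrate_state_derivative(
    f=lambda t, x, th: -th * x,
    df_dx=lambda t, x, th: -th,
    df_dtheta=lambda t, x, th: -x,
    x0=1.0, xp0=0.0, theta=theta, t0=0.0, t1=1.0,
)
print(xp_end, -1.0 * math.exp(-theta))  # numerical vs. closed-form sensitivity
```

When the dynamics in a mode do not depend on $\theta$ or on the state (as for the flow rates in the example of the next section), the integrand in (15) vanishes and the state derivative simply stays constant between events.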



To sum up, the basis for IPA on a system modeled as $G$ is
the pre-calculation, for all $q \in Q$, of the partial derivatives of the
dynamics $\mathbf{f}$, the cost integrand $\ell$, the guard functions $\mathbf{g}$, and the
reset functions $\mathbf{r}$ that appear in (9), (12), and (16), which are then used in (8), (14)
and (15) to update IPA derivatives. Finally, the results are
applied to (6). We will next apply this method to a specific
problem of interest [Cassandras and Lafortune, 2006] in
the following section.



5. IPA FOR A SINGLE-NODE SFM 

In what follows, we apply the method described above to 
a simple single-class single-node system shown in Fig. 3. 
We use a simplified notation here because of space limitations.
However, the dependence of functions on their arguments
should be clear from the analysis in the previous section.

Fig. 3. The DES model of the single node system (arrivals, losses, queue content $X(t, \theta)$, departures).

The system consists of a queue whose content level $X(t, \theta)$
is subject to stochastic arrival and service time processes.
The queue capacity is limited to a quantity $\theta$ treated as
the control parameter. Every arrival seeing a full queue is
lost and incurs a penalty. Considering this system over a
finite interval $[0, T]$, we want to find the best $\theta$ to trade off
between the average workload and average loss defined as

$$E_\omega[Q^{DES}(T, \theta, \omega)] = \frac{1}{T} E_\omega\left[\int_0^T X(t, \theta, \omega)\,dt\right] \tag{17}$$

$$E_\omega[L^{DES}(T, \theta, \omega)] = \frac{1}{T} E_\omega\left[N_{loss}(T, \theta, \omega)\right] \tag{18}$$



where $N_{loss}(\cdot)$ is the number of losses observed in the
interval $[0, T]$. Even for a simple system like this, the
analysis can become prohibitive when the stochastic processes
considered are arbitrary. Use of SFMs has proven to
be very helpful in the analysis and optimization of queueing
systems such as this one [Cassandras and Lafortune,
2006], [Cassandras et al., 2002] where applying IPA has
resulted in very simple derivative estimates of the loss and
workload objectives with respect to $\theta$. In the SFM, the
arrivals and departures are abstracted into non-negative
stochastic inflow rate $\{\alpha(t)\}$ and maximal service rate
$\{\beta(t)\}$ processes which are independent of $\theta$. These rates
continuously evolve according to the differential equations
$\dot{\alpha} = f_\alpha(t)$ and $\dot{\beta} = f_\beta(t)$ where $f_\alpha$ and $f_\beta$ are arbitrary
continuous functions. It is important to observe that the
precise nature of these functions turns out to be irrelevant
as far as the resulting IPA estimator is concerned: the
IPA estimators are independent of $f_\alpha$ and $f_\beta$. This is an
important robustness property of IPA estimators which
holds under certain conditions [Yao and Cassandras, 2011].
We allow for discrete jumps in both processes $\{\alpha(t)\}$ and
$\{\beta(t)\}$ and use timer states $y_\alpha$, $y_\beta$ to capture them.
The fluid discharge rate $d(t, \theta)$ is defined as $d(t, \theta) = \beta(t)$
when $x(t, \theta) > 0$ and $d(t, \theta) = \alpha(t)$ otherwise. For this
example, Assumption 3 manifests itself as follows:

Assumption 5. With probability 1, condition $\alpha(t) =
\beta(t)$ cannot be valid on a non-empty interval containing $t$.

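The buffer content process evolves according to the differential
equation $\dot{x}(t,\theta) = \alpha(t) - d(t,\theta)$ so we can write

$$\dot{x}(t,\theta) = f_x(t,\theta) = \begin{cases} 0 & \text{if } x(t,\theta) = 0 \text{ or } \theta \\ \alpha(t) - \beta(t) & \text{otherwise} \end{cases} \qquad (19)$$

When $x(t,\theta)$ reaches the buffer capacity level $\theta$, a portion
of the incoming flow is rejected with rate $\alpha(t) - \beta(t) > 0$.
Obviously, when $x(t,\theta) < \theta$ no loss occurs. Hence, we
define, for every $t \in [0,T]$, the loss rate as

$$\ell(t,\theta) = \begin{cases} 0 & \text{if } x(t,\theta) < \theta \\ \alpha(t) - \beta(t) & \text{otherwise} \end{cases} \qquad (20)$$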
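The dynamics (19) and the loss rate (20) can be explored numerically. The following is a minimal forward-Euler sketch, assuming the rates are sampled and held constant over each step; the function name `simulate_sfm` and the step size `dt` are illustrative choices, not part of the model. Clamping the content at $0$ and $\theta$ approximates the zero-drift cases of (19), and the clipped excess approximates the integral of the loss rate.

```python
def simulate_sfm(alpha, beta, theta, dt):
    """Forward-Euler sketch of (19)-(20).

    alpha, beta: equal-length sequences of sampled inflow / service rates
                 (assumed constant over each step of length dt).
    Returns (xs, workload, loss): the sampled buffer content, the integral
    of x over the horizon, and the integral of the loss rate.
    """
    x, workload, loss = 0.0, 0.0, 0.0
    xs = []
    for a, b in zip(alpha, beta):
        workload += x * dt             # left-endpoint rule for the workload
        x_next = x + (a - b) * dt      # interior dynamics: dx/dt = alpha - beta
        if x_next > theta:             # full buffer: the excess inflow is lost
            loss += x_next - theta
            x_next = theta
        elif x_next < 0.0:             # empty buffer: content cannot go negative
            x_next = 0.0
        x = x_next
        xs.append(x)
    return xs, workload, loss
```

Dividing `workload` and `loss` by the horizon `len(alpha) * dt` gives sample-path values of the two objectives for one realization of the rates.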

The SHA of this system is shown in Fig. 4.

Fig. 4. Stochastic Flow Model (SFM) for the single-class SFM.

We define the system state vector $\mathbf{x} = (\alpha, \beta, x, y_\alpha, y_\beta)$ and the
input $u = (\{A_j\}, \{U_j\}, \{B_k\}, \{V_k\})$ whose elements are sequences of
random variables from the jump distributions associated
with states $\alpha$, $y_\alpha$, $\beta$, and $y_\beta$, respectively. Although it
follows from the definitions that the states $\alpha$, $\beta$, $y_\alpha$, $y_\beta$ are
independent of $\theta$ yielding $\alpha'(t) = \beta'(t) = y'_\alpha(t) = y'_\beta(t) = 0$
for all $t \in [0,T]$, we include them at the start of the IPA
estimation procedure to fully illustrate the matrix setting
we have developed and make use of it later. Regarding the
dynamics of the states, we have





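$$f_q = \begin{cases} (f_\alpha, f_\beta, 0, -1, -1) & \text{if } q = 0 \\ (f_\alpha, f_\beta, \alpha - \beta, -1, -1) & \text{if } q = 1 \\ (f_\alpha, f_\beta, 0, -1, -1) & \text{otherwise} \end{cases} \qquad (21)$$

The workload and loss sample functions are

$$Q(T,\theta,\omega) = \frac{1}{T}\int_0^T x(t,\theta,\omega)\,dt, \qquad L(T,\theta,\omega) = \frac{1}{T}\int_0^T \ell(t,\theta,\omega)\,dt.$$

Using the fact that $x(t,\theta)$ and $\ell(t)$ can only contribute
to their associated objectives when, respectively, $x > 0$
($q = 1,2$) and $x = \theta$ ($q = 2$), we can write:

$$Q(T,\theta,\omega) = \frac{1}{T}\sum_{n=1}^{N}\int_{\xi_n(\theta)}^{\eta_n(\theta)} x(t,\theta,\omega)\,dt, \qquad (22)$$

$$L(T,\theta,\omega) = \frac{1}{T}\sum_{n=1}^{N}\sum_{m=1}^{M_n}\int_{\nu_{n,m}(\theta)}^{\sigma_{n,m}(\theta)} \ell(t,\theta,\omega)\,dt \qquad (23)$$

where $N$ is the number of supremal intervals $[\xi_n, \eta_n)$,
$n = 1,\ldots,N$ over which $x(t,\theta,\omega) > 0$ (i.e., $q = 1,2$)
and $M_n$ is the number of supremal intervals $[\nu_{n,m}, \sigma_{n,m})$,
$m = 1,\ldots,M_n$ such that $x(t,\theta,\omega) = \theta$ (i.e., $q = 2$). We
refer to the intervals of the first kind as Non-Empty Periods
(NEPs) and the second kind as Full Periods (FPs). We also
drop the sample path index $\omega$ to simplify the notation.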
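To make the two kinds of intervals concrete, the sketch below extracts NEPs and FPs from a sampled buffer trajectory. It assumes a uniformly sampled path; the helper names `maximal_intervals` and `neps_and_fps`, the step `dt`, and the tolerance `tol` are hypothetical and only serve the illustration.

```python
def maximal_intervals(flags, dt):
    """Return maximal runs of True in `flags` as half-open [start, end) times."""
    intervals, start = [], None
    for k, flag in enumerate(flags):
        if flag and start is None:
            start = k                               # a run begins at sample k
        elif not flag and start is not None:
            intervals.append((start * dt, k * dt))  # the run ended before sample k
            start = None
    if start is not None:                           # run still open at the horizon
        intervals.append((start * dt, len(flags) * dt))
    return intervals


def neps_and_fps(xs, theta, dt, tol=1e-12):
    """Split a sampled buffer path into Non-Empty Periods and Full Periods."""
    neps = maximal_intervals([x > tol for x in xs], dt)           # x > 0
    fps = maximal_intervals([x >= theta - tol for x in xs], dt)   # x = theta
    return neps, fps
```

For example, the path `[0, 1, 2, 2, 1, 0, 1, 0]` with `theta = 2` and `dt = 1.0` contains two NEPs, and the first one contains a single FP.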

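Differentiating (22) with respect to $\theta$ and noting $x(\xi_n(\theta)^+) =
x(\eta_n(\theta)^-) = 0$ for any $n$ gives

$$Q'(T,\theta) = \frac{1}{T}\sum_{n=1}^{N}\Big[x(\eta_n^-)\,\eta_n'(\theta) - x(\xi_n^+)\,\xi_n'(\theta) + \int_{\xi_n(\theta)}^{\eta_n(\theta)} x'(t,\theta)\,dt\Big] = \frac{1}{T}\sum_{n=1}^{N}\int_{\xi_n(\theta)}^{\eta_n(\theta)} x'(t,\theta)\,dt. \qquad (24)$$

Next, differentiating (23) with respect to $\theta$ and noticing
that by (20), $\ell'(t,\theta) = 0$ for all $t \in [0,T)$ (eliminating the
integral part), reveals

$$L'(T,\theta) = \frac{1}{T}\sum_{n=1}^{N}\sum_{m=1}^{M_n}\big[\sigma'_{n,m}\,\ell(\sigma_{n,m}^-,\theta) - \nu'_{n,m}\,\ell(\nu_{n,m}^+,\theta)\big]. \qquad (25)$$

It is now clear that to evaluate (24) and (25) we only need
to obtain $x'(t,\theta)$ for all $t \in [\xi_n, \eta_n)$, $n = 1,\ldots,N$ and the
event time derivatives $\nu'_{n,m}$, $\sigma'_{n,m}$ for every $m = 1,\ldots,M_n$
where $n = 1,\ldots,N$. However, as mentioned before, we try
to keep everything in the general matrix framework so as
to verify its effectiveness.

According to Fig. 4, the event set of the SHS is given as

$$E = \{E_i,\ i = 1,\ldots,5\}$$

with guard functions defined as follows: $E_1$ occurs when
$g_1 = \alpha - \beta = 0$. $E_2$ is the event of reaching the buffer
threshold $\theta$, so that $g_2 = x - \theta$. $E_3$ is the event ending a
non-empty period, hence $g_3 = x$. Finally, $E_4$ and $E_5$ are
associated with the timer run-offs captured by $g_4 = y_\alpha$
and $g_5 = y_\beta$, respectively. To summarize, the guard vector
for the system is

$$\mathbf{g}(t,\theta) = \big(\alpha(t) - \beta(t),\ x(t,\theta) - \theta,\ x(t,\theta),\ y_\alpha(t),\ y_\beta(t)\big). \qquad (26)$$

The reset maps are defined as follows:

$$r_i(m,n) = \begin{cases} (A_j, \beta, x, U_j, y_\beta) & \text{if } i = 4,\ m = n \in \{0,1,2\} \\ (\alpha, B_k, x, y_\alpha, V_k) & \text{if } i = 5,\ m = n \in \{0,1,2\} \\ \mathbf{x} & \text{otherwise} \end{cases}$$

where $A_j$ and $B_k$ are, respectively, the $j$th and $k$th
elements of random sequences $\{A_j\}$ and $\{B_k\}$. Using these
results in (13) we get

$$R(\tau_k) = \begin{bmatrix} \mathbf{x}(\tau_k) \\ \mathbf{x}(\tau_k) \\ \mathbf{x}(\tau_k) \\ r_4(q_{k-1}, q_k) \\ r_5(q_{k-1}, q_k) \end{bmatrix}. \qquad (27)$$

By (14), only the reset conditions associated with discontinuous
states need to be differentiated with respect to
$\theta$. Since $x$, the only state variable which depends on $\theta$, is
continuous, we conclude that the last term in (14) is always
0 and need not be evaluated.

The IPA starts by evaluating (8). Note that by (21) [...]
for all $q \in \{0,1,2\}$. Thus, we only need to
evaluate (9). Also, by definition, [...] $= 0$, so [...] need not
be evaluated. Then, from (26), we are left with

$$\frac{\partial \mathbf{g}}{\partial \mathbf{x}} = \begin{bmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \qquad \frac{\partial \mathbf{g}}{\partial \theta} = \begin{bmatrix} 0 \\ -1 \\ 0 \\ 0 \\ 0 \end{bmatrix}.$$

It follows from (9) that

$$\frac{d\mathbf{g}(\tau^-,\theta)}{d\theta} = \begin{bmatrix} \alpha'(\tau^-) - \beta'(\tau^-) \\ x'(\tau^-) - 1 \\ x'(\tau^-) \\ y'_\alpha(\tau^-) \\ y'_\beta(\tau^-) \end{bmatrix} = \begin{bmatrix} 0 \\ x'(\tau^-) - 1 \\ x'(\tau^-) \\ 0 \\ 0 \end{bmatrix}$$

where we used the fact that $\alpha$, $\beta$, $y_\alpha$ and $y_\beta$ are independent
of $\theta$. Moreover, by (26), we have the time derivative
of the guard functions just before the $k$th transition as

$$G(\tau^-,\theta) = \mathrm{diag}\big(f_\alpha(\tau^-) - f_\beta(\tau^-),\ \dot{x}(\tau^-),\ \dot{x}(\tau^-),\ -1,\ -1\big)$$

and since $E_2$ and $E_3$ are only feasible when $x > 0$
($q = 1,2$), by (19) we get $\dot{x}(\tau^-) = \alpha(\tau^-) - \beta(\tau^-) \neq 0$ in
the above expression. Combining the results in (8) gives the
event time derivatives, whose $i$th entry applies when the
transition is triggered by $E_i$:

$$\tau' = \Big(0,\ -\frac{x'(\tau^-) - 1}{\alpha(\tau^-) - \beta(\tau^-)},\ -\frac{x'(\tau^-)}{\alpha(\tau^-) - \beta(\tau^-)},\ 0,\ 0\Big). \qquad (28)$$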
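The entries of (28) can be checked numerically. The sketch below assumes that (8) takes the usual IPA form, in which each event time derivative equals minus the total $\theta$-derivative of the triggering guard divided by the time derivative of that guard; this form is an assumption here, chosen because it is consistent with (28). The function name and arguments are illustrative.

```python
def event_time_derivatives(alpha, beta, x_prime, f_alpha, f_beta):
    """Entrywise evaluation of tau' = -G^{-1} dg/dtheta for events E1..E5.

    alpha, beta:      rates just before the event (alpha != beta is required).
    x_prime:          dx/dtheta just before the event.
    f_alpha, f_beta:  time derivatives of the rates (f_alpha != f_beta).
    """
    x_dot = alpha - beta                                  # interior drift from (19)
    dg_dtheta = [0.0, x_prime - 1.0, x_prime, 0.0, 0.0]   # total theta-derivative of g
    g_diag = [f_alpha - f_beta, x_dot, x_dot, -1.0, -1.0] # diagonal of G
    return [-d / g for d, g in zip(dg_dtheta, g_diag)]
```

With `alpha = 3`, `beta = 1` and `x_prime = 0`, only the second entry (event $E_2$) is nonzero and equals $1/(\alpha - \beta)$, in agreement with (28).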
Next, we determine the state derivatives with respect to $\theta$.
Since $x$ is the only state dependent on $\theta$, we only apply (14)
and (15) to $x(t,\theta)$. We also need not evaluate $C(q_{k-1}, q_k)$,
$k = 1,\ldots,K$ as $x(t,\theta)$ is continuous throughout $[0,T]$.
Moreover, we need to determine $\Delta f_x(q_{k-1}, q_k)$ for any
feasible transition $(q_{k-1}, q_k)$. If we define $A(\tau) = \alpha(\tau) -
\beta(\tau)$ and $\Delta A(\tau) = A(\tau^-) - A(\tau^+)$ we get

$$\Delta f_x(q_{k-1}, q_k) = \begin{cases} -A(\tau^+) & \text{if } (q_{k-1}, q_k) \in \{(0,1), (2,1)\} \\ A(\tau^-) & \text{if } (q_{k-1}, q_k) \in \{(1,0), (1,2)\} \\ \Delta A(\tau) & \text{if } (q_{k-1}, q_k) = (1,1) \\ 0 & \text{otherwise} \end{cases}$$

Invoking (14) for $x(t,\theta)$ gives

$$x'(\tau_k^+,\theta) = x'(\tau_k^-,\theta) + \Delta f_x(q_{k-1}, q_k, \theta)\,\tau'_k. \qquad (29)$$

By (28), we only need to take care of the transitions
caused by $E_2$ and $E_3$ as in other cases $\Delta f_x(q_{k-1}, q_k, \theta)\,\tau'_k =
0$. Since neither $E_2$ nor $E_3$ appears in the transitions
with resets, they cannot create a chain of simultaneous
transitions, thereby leaving us with transition $(1,2)$ for $E_2$
and $(1,0)$ for $E_3$. In the first case, we get $\Delta f_x(1,2,\theta)\,\tau'_k =
1 - x'(\tau_k^-,\theta)$ and in the latter case we get $\Delta f_x(1,0,\theta)\,\tau'_k =
-x'(\tau_k^-,\theta)$. Inserting these results in (29) yields

$$x'(\tau_k^+,\theta) = \begin{cases} 1 & \text{if } (q_{k-1}, q_k) = (1,2) \\ 0 & \text{if } (q_{k-1}, q_k) = (1,0) \\ x'(\tau_k^-,\theta) & \text{otherwise} \end{cases} \qquad (30)$$
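The update rule (30) is purely event-driven, which the following minimal sketch makes explicit. It assumes the sample path is summarized as an ordered list of mode transitions $(q_{k-1}, q_k)$; the function name `propagate_state_derivative` and this list encoding are illustrative only.

```python
def propagate_state_derivative(transitions, x_prime=0.0):
    """Apply (30) along a sequence of mode transitions (q_prev, q_next).

    Returns the value of dx/dtheta right after each transition.
    """
    history = []
    for transition in transitions:
        if transition == (1, 2):      # buffer becomes full through E2
            x_prime = 1.0
        elif transition == (1, 0):    # buffer becomes empty through E3
            x_prime = 0.0
        # every other transition leaves the derivative unchanged
        history.append(x_prime)
    return history
```

For the sequence `(0,1), (1,2), (2,1), (1,1), (1,0), (0,1)` the derivative is 0 until the buffer first fills, 1 until it empties, and 0 afterwards.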



There is no need to consider (15) in this case, since
[...] for all $q$. Therefore, we are in the position to fully evaluate
the sample derivative estimates (24) and (25).

By (30), after the buffer becomes empty (transition $(1,0)$
through event $E_3$), $x'(t,\theta)$ becomes 0 and stays at 0 until a
transition $(1,2)$ occurs through $E_2$. If this happens, $x'(t,\theta)$
resets to 1 in (30) and remains constant until the buffer
becomes empty again. Therefore, we need only consider
those non-empty periods $[\xi_n, \eta_n)$ in which a transition $(1,2)$
occurs. If this happens, we calculate the length of the
interval between the first such transition and the next
time the buffer becomes empty. $Q'(T,\theta)$ in (24) is the sum
of the lengths of these intervals, scaled by $1/T$, i.e.,

$$Q'(T,\theta) = \frac{1}{T}\sum_{n=1}^{N} \mathbf{1}_{FP}(n)\,(\eta_n - \nu_{n,1}) \qquad (31)$$

where $\mathbf{1}_{FP}(n) = 1$ if there exists a transition $(1,2)$ in the
non-empty period $[\xi_n, \eta_n)$ and 0 otherwise.
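Evaluating (31) only requires the end time of each non-empty period and the time at which the buffer first becomes full within it. A minimal sketch, assuming each NEP is recorded as a tuple `(xi, eta, fp_starts)` where `fp_starts` lists the times $\nu_{n,m}$ observed in that period (this record layout and the function name are hypothetical):

```python
def workload_derivative(neps, horizon):
    """Evaluate (31) from a list of NEP records (xi, eta, fp_starts)."""
    total = 0.0
    for _xi, eta, fp_starts in neps:
        if fp_starts:                       # indicator 1_FP(n) equals 1
            total += eta - min(fp_starts)   # eta_n - nu_{n,1}
    return total / horizon
```

Note that no knowledge of the rate processes is needed: the estimator uses observed event times only.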

Next, to evaluate (25), notice that at $t = \sigma_{n,m}$ (end
of stay at $q = 2$ in Fig. 4) a transition to $q = 1$ can
occur in two ways: (a) Through $E_4$ or $E_5$ (transition
$(2,2)$) and violating the invariant condition $[\alpha > \beta]$ which
immediately fires transition $(2,1)$; (b) Directly, by $E_1$
(transition $(2,1)$). The three events involved ($E_1$, $E_4$ and $E_5$)
are associated with zeros in (28), so we have $\sigma'_{n,m} = 0$.
Regarding the term $-\nu'_{n,m}\,\ell(\nu_{n,m}^+,\theta)$ in (25), the event at
$\nu_{n,m}$ is $E_2$. By (28), we have

$$\nu'_{n,m} = -\frac{x'(\nu_{n,m}^-,\theta) - 1}{\alpha(\nu_{n,m}^-) - \beta(\nu_{n,m}^-)}.$$

Since by (20), $\ell(\nu_{n,m}^+,\theta) = \alpha(\nu_{n,m}^+) - \beta(\nu_{n,m}^+)$ and by Assumption 2,
$\alpha(\nu_{n,m}^+) - \beta(\nu_{n,m}^+) = \alpha(\nu_{n,m}^-) - \beta(\nu_{n,m}^-)$, we find that
$-\nu'_{n,m}\,\ell(\nu_{n,m}^+,\theta) = x'(\nu_{n,m}^-,\theta) - 1$. We have already shown
in (30) that in a non-empty period $[\xi_n, \eta_n)$, $x'(t,\theta) = 0$
for all $t \in [\xi_n, \nu_{n,1})$ and $x'(t,\theta) = 1$, $t \in [\nu_{n,1}, \eta_n)$. Hence,
$x'(\nu_{n,m}^-,\theta) = 0$ when $m = 1$ and $x'(\nu_{n,m}^-,\theta) = 1$ otherwise.
Combining all results into (25), we find that

$$L'(T,\theta) = \frac{1}{T}\sum_{n=1}^{N}\sum_{m=1}^{M_n} -\nu'_{n,m}\,\ell(\nu_{n,m}^+,\theta) = -\frac{N_F}{T}$$

where $N_F$ is the number of non-empty intervals with at
least one full period. These results recover those in [Cassandras
and Lafortune, 2006, pp. 700-703] and [Cassandras
et al., 2002]. Note that $Q'(T,\theta)$ and $L'(T,\theta)$ are independent
of $f_\alpha$ and $f_\beta$, i.e., these sensitivity estimates are
independent of the random arrival and service processes,
a fundamental robustness property of IPA.
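The loss estimator reduces to counting non-empty periods that contain at least one full period. The sketch below evaluates it from the same hypothetical NEP records `(xi, eta, fp_starts)` and compares it with a finite difference in a simple special case chosen for illustration: constant rates $\alpha > \beta$ and an initially empty buffer, for which the buffer fills at time $\theta/(\alpha - \beta)$ and the average loss over $[0,T]$ is $\max(0, (\alpha - \beta)T - \theta)/T$. All names are illustrative.

```python
def loss_derivative(neps, horizon):
    """IPA loss estimator: minus the number of NEPs containing a full period,
    divided by the horizon T."""
    n_f = sum(1 for _xi, _eta, fp_starts in neps if fp_starts)
    return -n_f / horizon


def constant_rate_loss(theta, alpha, beta, horizon):
    """Average loss for constant rates alpha > beta and an empty initial buffer."""
    return max(0.0, (alpha - beta) * horizon - theta) / horizon


def finite_difference(theta, alpha, beta, horizon, h=1e-3):
    """Forward-difference approximation of dL/dtheta in the constant-rate case."""
    upper = constant_rate_loss(theta + h, alpha, beta, horizon)
    lower = constant_rate_loss(theta, alpha, beta, horizon)
    return (upper - lower) / h
```

In the constant-rate case with $\theta = 1$, $\alpha = 2$, $\beta = 1$ and $T = 10$ there is a single NEP with one FP starting at $t = 1$, so the estimator returns $-1/T$, which the finite difference reproduces up to rounding.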

6. CONCLUSIONS 

We have introduced a general framework suitable for 
analysis and on-line optimization of Stochastic Hybrid 
Systems (SHS) which facilitates the use of Infinitesimal 
Perturbation Analysis (IPA). In doing so, we modified 
the previous structure of a Stochastic Hybrid Automaton 
(SHA) and showed that every transition is associated with 
an explicit event which is defined through a guard function. 
This also enables us to uniformly treat all events observed 
on the sample path of the SHS and makes it possible 
to develop a unifying matrix notation for IPA equations 
which eliminates the need for the case-by-case analysis 
based on event classes as in prior work involving IPA for 
SHS. 